AITopics | asymptotic performance

f7d3cef7ff579f2f903c8f458e730cae-Paper-Conference.pdf

Neural Information Processing Systems, Apr-30-2026, 08:38:51 GMT

artificial intelligence, machine learning, subtask, (14 more...)

Neural Information Processing Systems

Country:

Asia > China (0.47)
Europe (0.46)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models

Neural Information Processing Systems, Mar-16-2026, 19:55:21 GMT

Model-based reinforcement learning (RL) algorithms can attain excellent sample efficiency, but often lag behind the best model-free algorithms in terms of asymptotic performance. This is especially true with high-capacity parametric function approximators, such as deep networks. In this paper, we study how to bridge this gap, by employing uncertainty-aware dynamics models. We propose a new algorithm called probabilistic ensembles with trajectory sampling (PETS) that combines uncertainty-aware deep network dynamics models with sampling-based uncertainty propagation. Our comparison to state-of-the-art model-based and model-free deep RL algorithms shows that our approach matches the asymptotic performance of model-free algorithms on several challenging benchmark tasks, while requiring significantly fewer samples (e.g. 8 and 125 times fewer samples than Soft Actor Critic and Proximal Policy Optimization respectively on the half-cheetah task).
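The idea of combining an uncertainty-aware ensemble with sampling-based uncertainty propagation can be illustrated with a minimal sketch. It is not the authors' implementation: the ensemble members below are hand-written toy functions rather than trained deep networks, the reward and the 1-D state are assumptions, and a plain random-shooting planner stands in for whatever optimizer the paper uses. All names (`make_member`, `evaluate_plan`, `plan`) are hypothetical.

```python
import random

# Hypothetical stand-in for a trained ensemble: each member maps (state, action)
# to the mean and standard deviation of a Gaussian over the next state.
# In practice each member would be a neural network fit to collected data.
def make_member(gain, noise_std):
    def predict(state, action):
        return state + gain * action, noise_std
    return predict

ENSEMBLE = [make_member(0.9, 0.05), make_member(1.0, 0.05), make_member(1.1, 0.05)]

def reward(state):
    # Toy objective (assumed): stay close to the target state 1.0.
    return -(state - 1.0) ** 2

def evaluate_plan(actions, start_state, n_particles=30, rng=None):
    """Estimate the return of an action sequence by trajectory sampling."""
    if rng is None:
        rng = random.Random(0)
    total = 0.0
    for _ in range(n_particles):
        # Each particle follows one ensemble member for the whole rollout,
        # so disagreement between members shows up as spread across particles.
        member = rng.choice(ENSEMBLE)
        state = start_state
        for action in actions:
            mean, std = member(state, action)
            # Sampling from the predicted Gaussian propagates model noise.
            state = rng.gauss(mean, std)
            total += reward(state)
    return total / n_particles

def plan(start_state, horizon=3, n_candidates=200, seed=0):
    """Random-shooting planner: keep the best of n_candidates random plans."""
    rng = random.Random(seed)
    best_actions, best_value = None, float("-inf")
    for _ in range(n_candidates):
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        value = evaluate_plan(actions, start_state, rng=rng)
        if value > best_value:
            best_actions, best_value = actions, value
    return best_actions, best_value
```

Calling `plan(0.0)` returns the best sampled action sequence and its estimated return. In a control loop one would typically execute only the first action and replan from the next observed state.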

artificial intelligence, machine learning, reinforcement learning, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

f7d3cef7ff579f2f903c8f458e730cae-Paper-Conference.pdf

Neural Information Processing Systems, Feb-18-2026, 01:00:19 GMT

artificial intelligence, machine learning, subtask, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Europe > Portugal > Porto > Porto (0.04)
(3 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Distributed Multitask Reinforcement Learning with Quadratic Convergence

Rasul Tutunov, Dongho Kim, Haitham Bou Ammar

Neural Information Processing Systems, Feb-13-2026, 11:35:59 GMT

Multitask reinforcement learning (MTRL) suffers from scalability issues when the number of tasks or trajectories grows large.

machine learning, reinforcement learning, trajectory, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.50)

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Portugal (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(2 more...)

08058bf500242562c0d031ff830ad094-Supplemental.pdf

Neural Information Processing Systems, Feb-7-2026, 09:35:27 GMT

environment step, objective, sequence, (14 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.49)

Provably Optimal Reinforcement Learning under Safety Filtering

Oh, Donggeon David, Nguyen, Duy P., Hu, Haimin, Fisac, Jaime F.

arXiv.org Artificial Intelligence, Oct-22-2025

Recent advances in reinforcement learning (RL) enable its use on increasingly complex tasks, but the lack of formal safety guarantees still limits its application in safety-critical settings. A common practical approach is to augment the RL policy with a safety filter that overrides unsafe actions to prevent failures during both training and deployment. However, safety filtering is often perceived as sacrificing performance and hindering the learning process. We show that this perceived safety-performance tradeoff is not inherent and prove, for the first time, that enforcing safety with a sufficiently permissive safety filter does not degrade asymptotic performance. We formalize RL safety with a safety-critical Markov decision process (SC-MDP), which requires categorical, rather than high-probability, avoidance of catastrophic failure states. Additionally, we define an associated filtered MDP in which all actions result in safe effects, thanks to a safety filter that is considered to be a part of the environment. Our main theorem establishes that (i) learning in the filtered MDP is safe categorically, (ii) standard RL convergence carries over to the filtered MDP, and (iii) any policy that is optimal in the filtered MDP, when executed through the same filter, achieves the same asymptotic return as the best safe policy in the SC-MDP, yielding a complete separation between safety enforcement and performance optimization. We validate the theory on Safety Gymnasium with representative tasks and constraints, observing zero violations during training and final performance matching or exceeding unfiltered baselines. Together, these results shed light on a long-standing question in safety-filtered learning and provide a simple, principled recipe for safe RL: train and deploy RL policies with the most permissive safety filter that is available.
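The notion of a filtered MDP, where the safety filter is treated as part of the environment, can be made concrete with a toy sketch. This is an illustration under stated assumptions and not the paper's construction: the 1-D integer world, the single failure state, and the "stay put" fallback are all invented for the example, as are the names `safety_filter`, `filtered_step`, and `rollout`.

```python
import random

FAILURE_STATE = 5      # assumed catastrophic state in a toy 1-D world
ACTIONS = (-1, 0, 1)   # move left, stay, move right

def step(state, action):
    """Unfiltered toy dynamics: move along the integer line."""
    return state + action

def safety_filter(state, action):
    """Pass the action through unless it would enter the failure state.

    Overriding only when strictly necessary is what makes this toy filter
    permissive: every safe action is left untouched.
    """
    if step(state, action) == FAILURE_STATE:
        return 0  # fallback: stay put (safe here because the state is not a failure)
    return action

def filtered_step(state, action):
    """Filtered dynamics: the filter is applied inside the environment."""
    return step(state, safety_filter(state, action))

def rollout(policy, start_state=0, n_steps=1000, seed=0):
    """Run a policy in the filtered environment and record visited states."""
    rng = random.Random(seed)
    state, visited = start_state, []
    for _ in range(n_steps):
        state = filtered_step(state, policy(state, rng))
        visited.append(state)
    return visited

def random_policy(state, rng):
    # An exploring learner is modelled here as a uniformly random policy.
    return rng.choice(ACTIONS)
```

Because the learner only ever interacts with `filtered_step`, even a uniformly random policy never reaches `FAILURE_STATE` in this toy world, which mirrors the abstract's point that exploration in the filtered MDP is safe by construction.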

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2510.18082

Genre: Research Report (1.00)

Industry: Transportation (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
